Stochastic local search algorithm for solving exact satisfiability problem
Xingyu ZHAO, Xiaofeng WANG, Yi YANG, Lichao PANG, Lan YANG
Journal of Computer Applications    2024, 44 (3): 842-848.   DOI: 10.11772/j.issn.1001-9081.2023030364

The SATisfiability problem (SAT) is an NP-complete problem with wide applications in artificial intelligence and machine learning. The Exact SATisfiability problem (XSAT), which requires exactly one true literal per clause, is an important subproblem of SAT. Most current research on XSAT remains at the theoretical level; few efficient solving algorithms, especially stochastic local search algorithms with efficient verifiability, have been studied. To address the above problems, and after analyzing some properties of both the basic and the equivalent encoding formulas, a stochastic local search algorithm named WalkXSAT was proposed for solving XSAT directly. Firstly, the stochastic local search framework was used for basic search and condition determination. Secondly, an appropriate unsatisfiability score of the clause to which each variable belongs was added, and variables that were hard to satisfy appropriately were prioritized. Thirdly, the search space was reduced using a heuristic strategy that avoids repeatedly selecting flipped variables. Finally, instances from multiple sources and in multiple formats were used for comparative experiments. Compared with the ProbSAT algorithm, the number of variable flips and the solving time of WalkXSAT are significantly reduced when solving XSAT directly. On instances transformed by the basic encoding, when the number of variables exceeds 100, ProbSAT is no longer effective, whereas WalkXSAT can still solve XSAT in a short time. Experimental results show that the proposed WalkXSAT algorithm has high accuracy, strong stability, and fast convergence.
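
A minimal illustrative sketch of the core loop (not the paper's implementation): a WalkSAT-style local search adapted to the exactly-one criterion of XSAT. WalkXSAT's clause scoring and anti-repeat heuristics are only approximated here by the greedy flip and a `last_flipped` memory; all names and parameters are ours.

```python
import random

def true_count(clause, assign):
    # number of literals in the clause that evaluate to True
    return sum((assign[abs(l)] if l > 0 else not assign[abs(l)]) for l in clause)

def walk_xsat(clauses, n_vars, max_flips=100_000, noise=0.3, seed=0):
    """Stochastic local search for XSAT: every clause must contain
    EXACTLY one true literal (not 'at least one' as in plain SAT)."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    last_flipped = None                       # crude anti-repeat memory
    for _ in range(max_flips):
        unsat = [c for c in clauses if true_count(c, assign) != 1]
        if not unsat:
            return assign                     # every clause exactly-1-satisfied
        clause = rng.choice(unsat)
        cand = [abs(l) for l in clause if abs(l) != last_flipped] \
               or [abs(l) for l in clause]
        if rng.random() < noise:              # random-walk step
            var = rng.choice(cand)
        else:                                 # greedy step: flip the variable
            def violated_after_flip(v):       # leaving fewest violated clauses
                assign[v] = not assign[v]
                n = sum(true_count(c, assign) != 1
                        for c in clauses if v in map(abs, c))
                assign[v] = not assign[v]
                return n
            var = min(cand, key=violated_after_flip)
        assign[var] = not assign[var]
        last_flipped = var
    return None                               # give up: no solution found

# exactly one of {x1, x2} true, and exactly one of {not x1, x3} true
print(walk_xsat([[1, 2], [-1, 3]], n_vars=3))
```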

Analysis of consistency between sensitive behavior and privacy policy of Android applications
Baoshan YANG, Zhi YANG, Xingyuan CHEN, Bing HAN, Xuehui DU
Journal of Computer Applications    2024, 44 (3): 788-796.   DOI: 10.11772/j.issn.1001-9081.2023030290

The privacy policy document declares the privacy information that an application needs to obtain, but it cannot guarantee that it clearly and fully discloses the types of privacy information the application actually obtains. At present, the analysis of consistency between the actual sensitive behaviors of applications and their privacy policies is still deficient. To address these issues, a method for analyzing the consistency between sensitive behaviors and privacy policies of Android applications was proposed. In the privacy policy analysis stage, a Bi-GRU-CRF (Bi-directional Gated Recurrent Unit Conditional Random Field) neural network was used, and the model was incrementally trained with a custom annotation library to extract key information from privacy policy declarations. In the sensitive behavior analysis stage, the IFDS (Interprocedural, Finite, Distributive, Subset) algorithm was optimized by classifying sensitive API (Application Programming Interface) calls, deleting already-analyzed sensitive API calls from the input sensitive source list, and marking already-extracted sensitive paths. This ensured that the analysis results of sensitive behaviors matched the language granularity of privacy policy descriptions, reduced the redundancy of the analysis results, and improved analysis efficiency. In the consistency analysis stage, the semantic relationships between ontologies were classified into equivalence, subordination, and approximation relationships, and a formal model of consistency between sensitive behaviors and privacy policies was defined on this basis. Consistency between sensitive behaviors and privacy policies was classified into clear expression and ambiguous expression, and inconsistency was classified into omitted expression, incorrect expression, and ambiguous expression. Finally, based on the proposed semantic-similarity-based consistency analysis algorithm, the consistency between sensitive behaviors and privacy policies was analyzed. Experimental results on 928 applications show that, with a privacy policy analysis accuracy of 97.34%, 51.4% of Android applications are found to have inconsistencies between their actual sensitive behaviors and their privacy policy declarations.

User plagiarism identification scheme in social network under blockchain
Li LI, Chunyan YANG, Jiangwen ZHU, Ronglei HU
Journal of Computer Applications    2024, 44 (1): 242-251.   DOI: 10.11772/j.issn.1001-9081.2023010031

To address the difficulty of identifying user plagiarism in social networks, and to protect the rights of original authors while holding plagiarists accountable, a plagiarism identification scheme for social network users under blockchain was proposed. Aiming at the lack of a universal tracing model in existing blockchains, a blockchain-based traceability information management model was designed to record user operation information and provide a basis for text similarity detection. Based on the Merkle tree and Bloom filter structures, a new index structure, BHMerkle, was designed, which reduced the computational overhead of block construction and query and enabled rapid location of transactions. At the same time, a multi-feature weighted Simhash algorithm was proposed to improve the precision of word-weight calculation and the efficiency of the signature-matching stage. In this way, malicious users who plagiarize could be identified, and malicious behavior could be curbed through a reward and punishment mechanism. The average precision and recall of the plagiarism detection scheme on news datasets with different topics were 94.8% and 88.3%, respectively. Compared with the multi-dimensional Simhash algorithm and the Simhash algorithm based on information Entropy weighting (E-Simhash), the average precision was increased by 6.19 and 4.01 percentage points, and the average recall by 3.12 and 2.92 percentage points, respectively. Experimental results show that the proposed scheme improves the query and detection efficiency of plagiarized text and has high accuracy in plagiarism identification.
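
The fingerprinting step can be sketched as below; this is a generic weighted Simhash with Hamming-distance matching, with plain term-frequency weights standing in for the paper's multi-feature weighting, and MD5 as an arbitrary base hash.

```python
import hashlib
from collections import Counter

def simhash(tokens, weights=None, bits=64):
    """Weighted Simhash: each token hashes to `bits` bits; its weight is
    added where a bit is 1 and subtracted where it is 0, and the sign of
    each accumulated component yields one fingerprint bit."""
    weights = weights if weights is not None else Counter(tokens)
    acc = [0.0] * bits
    for tok, w in weights.items():
        h = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16)
        for i in range(bits):
            acc[i] += w if (h >> i) & 1 else -w
    return sum(1 << i for i in range(bits) if acc[i] > 0)

def hamming(a, b):
    return bin(a ^ b).count("1")

doc1 = "original news article about the local election results".split()
doc2 = "original news article describing the local election results".split()
print(hamming(simhash(doc1), simhash(doc2)))  # small distance -> near-duplicate
```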

Scene graph-aware cross-modal image captioning model
Zhiping ZHU, Yan YANG, Jie WANG
Journal of Computer Applications    2024, 44 (1): 58-64.   DOI: 10.11772/j.issn.1001-9081.2022071109

Aiming at the forgetting and underutilization of image text information in image captioning methods, a Scene Graph-aware Cross-modal Network (SGC-Net) was proposed. Firstly, the scene graph was utilized as the image's visual feature, and a Graph Convolutional Network (GCN) was used for feature fusion, so that the visual and textual features lay in the same feature space. Then, the text sequence generated by the model was stored, and the corresponding position information was added as the textual feature of the image, so as to solve the text feature loss caused by a single-layer Long Short-Term Memory (LSTM) network. Finally, to address the over-dependence on image information and the underuse of text information, the self-attention mechanism was utilized to extract significant image and text information and fuse them. Experimental results on the Flickr30K and MS-COCO (MicroSoft Common Objects in COntext) datasets demonstrate that SGC-Net outperforms Sub-GC on BLEU1 (BiLingual Evaluation Understudy with 1-gram), BLEU4 (BiLingual Evaluation Understudy with 4-grams), METEOR (Metric for Evaluation of Translation with Explicit ORdering), ROUGE (Recall-Oriented Understudy for Gisting Evaluation) and SPICE (Semantic Propositional Image Caption Evaluation), with improvements of 1.1, 0.9, 0.3, 0.7 and 0.4 on Flickr30K and 0.3, 0.1, 0.3, 0.5 and 0.6 on MS-COCO, respectively. It can be seen that SGC-Net effectively improves image captioning performance and the fluency of the generated descriptions.

Multi-view clustering network with deep fusion
Ziyi HE, Yan YANG, Yiling ZHANG
Journal of Computer Applications    2023, 43 (9): 2651-2656.   DOI: 10.11772/j.issn.1001-9081.2022091394

Current deep multi-view clustering methods have the following shortcomings: 1) when features are extracted from a single view, only the attribute information or the structural information of the samples is considered, and the two are not integrated, so the extracted features cannot fully represent the latent structure of the original data; 2) feature extraction and clustering are treated as two separate processes with no relationship established between them, so the feature extraction process cannot be optimized by the clustering process. To solve these problems, a Deep Fusion based Multi-view Clustering Network (DFMCN) was proposed. Firstly, the embedding space of each view was obtained by combining an autoencoder and a graph convolutional autoencoder to fuse the attribute and structure information of the samples. Then, the embedding space of the fused view was obtained through weighted fusion, and clustering was carried out in this space; during clustering, the feature extraction process was optimized by a two-layer self-supervision mechanism. Experimental results on the FM (Fashion-MNIST), HW (HandWritten numerals), and YTF (YouTube Face) datasets show that the accuracy of DFMCN is higher than that of all comparison methods: on the FM dataset, DFMCN improves accuracy by 1.80 percentage points over the suboptimal CMSC-DCCA (Cross-Modal Subspace Clustering via Deep Canonical Correlation Analysis) method, and its Normalized Mutual Information (NMI) is 1.26 to 14.84 percentage points higher than that of all methods except CMSC-DCCA and DMSC (Deep Multimodal Subspace Clustering networks). These results verify the effectiveness of the proposed method.
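
The fusion-and-cluster step can be illustrated as follows; the per-view encoders and the two-layer self-supervision are beyond a snippet, so random arrays stand in for the learned view embeddings and the view weights are fixed by hand rather than learned.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# stand-ins for per-view embeddings: an autoencoder output (attribute
# information) and a graph convolutional autoencoder output (structure)
z_attr = rng.normal(size=(500, 32))
z_graph = rng.normal(size=(500, 32))

w = np.array([0.6, 0.4])                     # illustrative fixed view weights
z_fused = w[0] * z_attr + w[1] * z_graph     # embedding of the fused view

labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(z_fused)
print(labels[:20])
```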

Color image information hiding algorithm based on style transfer process
Pan YANG, Minqing ZHANG, Yu GE, Fuqiang DI, Yingnan ZHANG
Journal of Computer Applications    2023, 43 (6): 1730-1735.   DOI: 10.11772/j.issn.1001-9081.2022060953

To solve the problem that information hiding algorithms based on neural style transfer do not address the embedding of color images, a color image information hiding algorithm based on the style transfer process was proposed. Firstly, the feature extraction capability of Convolutional Neural Networks (CNN) was utilized to extract the semantic information of the carrier image, the style information of the style image, and the feature information of the color secret image. Then, the semantic content of the images and the different styles were fused. Finally, the embedding of the color image was completed by the decoder while performing the style transfer of the carrier image. Experimental results show that the proposed algorithm can effectively integrate the secret image into the generated stylized image, making the embedding behavior indistinguishable from the style change. While maintaining the security of the algorithm, the hiding capacity is increased to 24 bpp, and the average Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) reach 25.29 dB and 0.85 respectively, thereby effectively solving the color image embedding problem.
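
For reference, the fidelity metric quoted above has a standard definition; the sketch below computes PSNR on a toy cover/stego pair (the images and distortion are synthetic, not from the paper).

```python
import numpy as np

def psnr(img_a, img_b, peak=255.0):
    """Peak Signal-to-Noise Ratio for 8-bit images:
    PSNR = 10 * log10(peak^2 / MSE)."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(256, 256, 3))
stego = np.clip(cover + rng.integers(-8, 9, size=cover.shape), 0, 255)
print(f"PSNR = {psnr(cover, stego):.2f} dB")
```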

Attribute reduction for high-dimensional data based on bi-view of similarity and difference
Yuanjiang LI, Jinsheng QUAN, Yangyi TAN, Tian YANG
Journal of Computer Applications    2023, 43 (5): 1467-1472.   DOI: 10.11772/j.issn.1001-9081.2022081154

To address the curse of dimensionality caused by excessively high data dimensionality and redundant information, a high-dimensional Attribute Reduction algorithm based on the Similarity and Difference Matrix (ARSDM) was proposed. On the basis of the discernibility matrix, a similarity measure for samples of the same class was added to form a comprehensive evaluation of all samples. Firstly, the distances between samples under each attribute were calculated, and the within-class similarity and between-class difference were obtained from these distances. Secondly, a similarity and difference matrix was established to evaluate the entire dataset. Finally, attribute reduction was performed: each column of the similarity and difference matrix was summed, the attribute with the largest value was added to the reduct in turn, and the row vectors of the corresponding sample pairs were set to zero. Experimental results show that, compared with the classical attribute reduction algorithms DMG (Discernibility Matrix based on Graph theory), FFRS (Fitting Fuzzy Rough Sets) and GBNRS (Granular Ball Neighborhood Rough Sets), the average classification accuracy of ARSDM is increased by 1.07, 6.48, and 8.92 percentage points respectively under the Classification And Regression Tree (CART) classifier, and by 1.96, 11.96, and 12.39 percentage points under the Support Vector Machine (SVM) classifier. ARSDM also outperforms GBNRS and FFRS in running efficiency. It can be seen that ARSDM can effectively remove redundant information and improve classification accuracy.
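
A schematic reading of the greedy procedure (our own simplification: attribute values are assumed scaled to [0, 1], and the coverage threshold is arbitrary):

```python
import numpy as np

def arsdm_reduct(X, y, k):
    """Schematic attribute reduction: build a (sample pair x attribute)
    score matrix rewarding between-class difference and within-class
    similarity, then repeatedly pick the attribute with the largest
    column sum and zero the sample-pair rows it already covers."""
    n, m = X.shape
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    M = np.zeros((len(pairs), m))
    for r, (i, j) in enumerate(pairs):
        d = np.abs(X[i] - X[j])              # per-attribute distance
        M[r] = d if y[i] != y[j] else 1.0 - d
    reduct = []
    for _ in range(k):
        col = int(np.argmax(M.sum(axis=0)))  # best remaining attribute
        reduct.append(col)
        M[M[:, col] > 0.5] = 0.0             # rows this attribute covers
        M[:, col] = 0.0
    return reduct

rng = np.random.default_rng(0)
X = rng.random((40, 8))
y = rng.integers(0, 2, size=40)
print(arsdm_reduct(X, y, k=3))
```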

Survey of multimodal pre-training models
Huiru WANG, Xiuhong LI, Zhe LI, Chunming MA, Zeyu REN, Dan YANG
Journal of Computer Applications    2023, 43 (4): 991-1004.   DOI: 10.11772/j.issn.1001-9081.2022020296

By using complex pre-training objectives and large numbers of parameters, Pre-Training Models (PTM) can effectively capture rich knowledge from unlabeled data. However, the development of multimodal PTMs is still in its infancy. According to the differences between modalities, most current multimodal PTMs can be divided into image-text PTMs and video-text PTMs; according to their data fusion methods, they can be divided into single-stream and two-stream models. Firstly, common pre-training tasks and the downstream tasks used in validation experiments were summarized. Secondly, the common models in multimodal pre-training were reviewed, and the downstream tasks, performance, and experimental data of each model were tabulated for comparison. Thirdly, the application scenarios of the M6 (Multi-Modality to Multi-Modality Multitask Mega-transformer), Cross-modal Prompt Tuning (CPT), VideoBERT (Video Bidirectional Encoder Representations from Transformers), and AliceMind (Alibaba's collection of encoder-decoders from Mind) models in specific downstream tasks were introduced. Finally, the challenges and future research directions of multimodal PTM work were summarized.

Collaborative filtering algorithm based on collaborative training and Boosting
Xiaohan YANG, Guosheng HAO, Xiehua ZHANG, Zihao YANG
Journal of Computer Applications    2023, 43 (10): 3136-3141.   DOI: 10.11772/j.issn.1001-9081.2022101489

Collaborative Filtering (CF) algorithms realize personalized recommendation based on the similarity between items or users, but data sparsity has always been one of their challenges. In order to improve prediction accuracy under sparse user-item ratings, a CF algorithm based on Collaborative Training and Boosting (CFCTB) was proposed. First, two CF models were integrated into one framework through collaborative training: each added its high-confidence pseudo-labeled samples to the other's training set, and Boosting-weighted training data were used to assist the collaborative training. Then, a weighted ensemble was used to predict the final user ratings, effectively avoiding the accumulation of noise introduced by pseudo-labeled samples and further improving recommendation performance. Experimental results show that the accuracy of the proposed algorithm is better than that of the single models on four open datasets. On the CiaoDVD dataset with the highest sparsity, the proposed algorithm reduces the Mean Absolute Error (MAE) by 4.737% compared with GLocal-K (Global and Local Kernels for recommender systems), and reduces the Root Mean Squared Error (RMSE) by 7.421% compared with the ECoRec (Ensemble of Co-trained Recommenders) algorithm. These results verify the effectiveness of the proposed algorithm.
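
A schematic co-training loop (our own illustration: generic regressors on feature vectors stand in for the two CF models, inter-model agreement serves as the confidence measure, and the Boosting weighting is omitted):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor

def co_train(XL, yL, XU, rounds=5, batch=20):
    """Two heterogeneous regressors label the unrated pool for each
    other; only the most confidently pseudo-labeled samples (here:
    smallest disagreement between the two models) are exchanged."""
    A, B = RandomForestRegressor(random_state=0), KNeighborsRegressor()
    XA, yA, XB, yB, pool = XL.copy(), yL.copy(), XL.copy(), yL.copy(), XU.copy()
    for _ in range(rounds):
        if len(pool) == 0:
            break
        A.fit(XA, yA); B.fit(XB, yB)
        pa, pb = A.predict(pool), B.predict(pool)
        conf = np.argsort(np.abs(pa - pb))[:batch]    # most-agreed samples
        XA = np.vstack([XA, pool[conf]]); yA = np.concatenate([yA, pb[conf]])
        XB = np.vstack([XB, pool[conf]]); yB = np.concatenate([yB, pa[conf]])
        pool = np.delete(pool, conf, axis=0)
    A.fit(XA, yA); B.fit(XB, yB)
    return lambda X: 0.5 * (A.predict(X) + B.predict(X))  # weighted ensemble

rng = np.random.default_rng(0)
X = rng.random((300, 6)); y = X @ rng.random(6) + 0.1 * rng.normal(size=300)
predict = co_train(X[:50], y[:50], X[50:250])
print(np.abs(predict(X[250:]) - y[250:]).mean())          # MAE on held-out data
```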

Power data analysis based on financial technical indicators
An YANG, Qun JIANG, Gang SUN, Jie YIN, Ying LIU
Journal of Computer Applications    2022, 42 (3): 904-910.   DOI: 10.11772/j.issn.1001-9081.2021030447

Considering the lack of effective trend feature descriptors in existing methods, financial technical indicators such as the Vertical Horizontal Filter (VHF) and Moving Average Convergence/Divergence (MACD) were introduced into power data analysis, and an anomaly detection algorithm and a load forecasting algorithm based on them were proposed. In the anomaly detection algorithm, thresholds for the various financial technical indicators were determined statistically, and abnormal power consumption behaviors were detected by threshold checks. In the load forecasting algorithm, 14-dimensional daily load features related to the financial technical indicators were extracted, and a Long Short-Term Memory (LSTM) load forecasting model was built. Experimental results on industrial power data of Hangzhou show that the proposed load forecasting algorithm reduces the Mean Absolute Percentage Error (MAPE) to 9.272%, which is lower than that of the Autoregressive Integrated Moving Average (ARIMA), Prophet and Support Vector Machine (SVM) algorithms by 2.322, 24.175 and 1.310 percentage points, respectively. The results show that financial technical indicators can be effectively applied to power data analysis.
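
The two indicators named above have standard textbook definitions; here is a small sketch of both applied to a synthetic daily-load series (the window length and EMA spans are the conventional defaults, not values from the paper):

```python
import numpy as np

def ema(x, span):
    """Exponential moving average with alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1.0)
    out = np.empty(len(x))
    out[0] = x[0]
    for t in range(1, len(x)):
        out[t] = alpha * x[t] + (1 - alpha) * out[t - 1]
    return out

def vhf(x, n=28):
    """Vertical Horizontal Filter: (max - min) / sum(|first differences|)
    over a window; high values suggest a trend, low values a range."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    for t in range(n, len(x)):
        win = x[t - n:t + 1]
        denom = np.abs(np.diff(win)).sum()
        out[t] = (win.max() - win.min()) / denom if denom else 0.0
    return out

def macd(x, fast=12, slow=26, signal=9):
    """MACD line = EMA(fast) - EMA(slow); signal line = EMA of the MACD."""
    x = np.asarray(x, dtype=float)
    line = ema(x, fast) - ema(x, slow)
    return line, ema(line, signal)

load = 100 + np.cumsum(np.random.default_rng(0).normal(size=365))  # daily load
line, sig = macd(load)
print(vhf(load)[-1], line[-1] - sig[-1])   # trend strength, MACD histogram
```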

Fall detection algorithm based on joint point features
Jianrong CAO, Yaqin ZHU, Yuting ZHANG, Junjie LYU, Hongjuan YANG
Journal of Computer Applications    2022, 42 (2): 622-630.   DOI: 10.11772/j.issn.1001-9081.2021040618

In order to solve the problems of heavy network computation and the difficulty of distinguishing fall-like behaviors in fall detection, a fall detection algorithm based on joint point features was proposed. Firstly, building on the advanced CenterNet algorithm, a Depthwise Separable Convolution-CenterNet (DSC-CenterNet) joint point detection algorithm was proposed to accurately detect human joint points and obtain their coordinates while reducing the computation of the backbone network. Then, based on the joint point coordinates and prior knowledge of the human body, spatial and temporal features expressing fall behavior were extracted as the joint point features. Finally, the joint point feature vector was fed into a fully connected layer followed by a Sigmoid classifier to output one of two categories, fall or non-fall, thereby achieving fall detection of human targets. Experimental results on the UR Fall Detection dataset show that the proposed algorithm achieves an average fall detection accuracy of 98.00% under different state changes, an accuracy of 98.22% in distinguishing fall-like behaviors, and a detection speed of 18.6 frame/s. Compared with the original CenterNet combined with joint point features, DSC-CenterNet combined with joint point features increases the average detection accuracy by 22.37%. The improved speed can effectively meet the real-time requirement of human fall detection under surveillance video. This algorithm can effectively increase fall detection speed and accurately detect the fall state of the human body, which further verifies the feasibility and efficiency of joint-point-feature-based fall detection in video fall behavior analysis.
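
The abstract does not spell out the exact spatio-temporal features, so the sketch below uses two plausible stand-ins (bounding-box aspect ratio and downward centroid velocity) and hand-picked classifier weights; everything here is illustrative, not the paper's feature set.

```python
import numpy as np

def fall_features(joints_t0, joints_t1, dt=1.0 / 18.6):
    """Two illustrative features from 2-D joint coordinates of shape (K, 2)
    in image coordinates (y grows downwards): bounding-box aspect ratio
    and downward centroid velocity between two frames."""
    aspect = np.ptp(joints_t1[:, 0]) / (np.ptp(joints_t1[:, 1]) + 1e-6)
    v_down = (joints_t1[:, 1].mean() - joints_t0[:, 1].mean()) / dt
    return np.array([aspect, v_down])

def is_fall(features, w=np.array([1.2, 0.01]), b=-2.0):
    """Tiny stand-in for the fully connected layer + Sigmoid output."""
    return 1.0 / (1.0 + np.exp(-(features @ w + b))) > 0.5

rng = np.random.default_rng(0)
standing = rng.normal([320, 200], [15, 60], size=(17, 2))  # tall, narrow pose
fallen = rng.normal([320, 400], [60, 15], size=(17, 2))    # wide, low pose
print(is_fall(fall_features(standing, fallen)))
```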

Feature selection algorithm for imbalanced data based on pseudo-label consistency
Yiheng LI, Chenxi DU, Yanyan YANG, Xiangyu LI
Journal of Computer Applications    2022, 42 (2): 475-484.   DOI: 10.11772/j.issn.1001-9081.2021050957

Aiming at the problem that most granular computing algorithms ignore the class imbalance of data, a feature selection algorithm integrating a pseudo-label strategy was proposed to deal with class-imbalanced data. Firstly, to investigate feature selection on class-imbalanced data conveniently, sample consistency and dataset consistency were re-defined, and a corresponding greedy forward search algorithm for feature selection was designed. Then, the pseudo-label strategy was introduced to balance the class distribution of the data: by integrating the learned pseudo-label of each sample into the consistency measure, the pseudo-label consistency was defined to evaluate features on class-imbalanced datasets. Finally, an algorithm for Pseudo-Label Consistency based Feature Selection (PLCFS) for class-imbalanced data was developed, based on preserving the pseudo-label consistency measure of the class-imbalanced dataset. Experimental results indicate that the proposed PLCFS performs second only to the max-Relevance and Min-Redundancy (mRMR) algorithm, and outperforms the Relief algorithm and the algorithm for Consistency-based Feature Selection (CFS).

Review of peer grading technologies for online education
Jia XU, Jing LIU, Ge YU, Pin LYU, Panyuan YANG
Journal of Computer Applications    2022, 42 (12): 3913-3923.   DOI: 10.11772/j.issn.1001-9081.2021101709

With the rapid development of online education platforms represented by Massive Open Online Courses (MOOC), how to evaluate the large-scale subjective question assignments submitted by platform learners is a big challenge. Peer grading is the mainstream scheme for this challenge and has attracted wide attention from both academia and industry in recent years. Therefore, peer grading technologies for online education were surveyed and analyzed. Firstly, the general process of peer grading was summarized. Secondly, the main research results on important peer grading activities, such as grader allocation, comment analysis, detection and processing of abnormal peer grading information, and true grade estimation of subjective question assignments, were explained. Thirdly, the peer grading functions of representative online education platforms and published teaching systems were compared. Finally, future development trends of peer grading were summarized and prospected, providing a reference for people who are engaged in or intend to engage in peer grading research.

Trusted integrity verification scheme of cloud data without bilinear pairings
Wenyong YUAN, Xiuguang LI, Ruifeng LI, Zhengge YI, Xiaoyuan YANG
Journal of Computer Applications    2022, 42 (12): 3769-3774.   DOI: 10.11772/j.issn.1001-9081.2021101780

Focusing on the malicious cheating behaviors of the Third Party Auditor (TPA) in cloud auditing, a trusted cloud auditing scheme without bilinear pairings was proposed to support correct judgment of the TPA's behaviors. Firstly, a pseudo-random bit generator was used to generate the random challenge information, which ensured the reliability of the challenge information generated by the TPA. Secondly, a hash value was added in the evidence generation process to effectively protect the privacy of user data. Thirdly, an interaction between the user and the TPA's results was added to the evidence verification process: according to these results, data integrity was checked and it was judged whether the TPA had completed the audit request truthfully. Finally, the scheme was extended to support batch auditing of multiple pieces of data. Security analysis shows that the proposed scheme can resist substitution and forgery attacks and can protect data privacy. Compared with the Merkle-Hash-Tree based Without Bilinear PAiring (MHT-WiBPA) audit scheme, the proposed scheme has comparable evidence verification time and reduces the label generation time by about 49.96%. Efficiency analysis shows that the proposed scheme achieves lower computational and communication costs while ensuring the credibility of audit results.

High-capacity reversible data hiding in encrypted videos based on histogram shifting
Pei CHEN, Shuaiwei ZHANG, Yangping LIN, Ke NIU, Xiaoyuan YANG
Journal of Computer Applications    2022, 42 (11): 3633-3638.   DOI: 10.11772/j.issn.1001-9081.2021101722

Aiming at the low embedding capacity of Reversible Data Hiding (RDH) in encrypted videos, a high-capacity RDH scheme for encrypted videos based on histogram shifting was proposed. Firstly, the 4×4 luminance intra-prediction modes and the sign bits of the Motion Vector Differences (MVD) were encrypted with a stream cipher; then a two-dimensional histogram of the MVDs was constructed and a (0,0)-symmetric histogram shifting algorithm was designed; finally, this algorithm was applied in the encrypted MVD domain to realize separable RDH in encrypted videos. Experimental results show that the embedding capacity of the proposed scheme is increased by 263.3% on average over the comparison schemes, the average Peak Signal-to-Noise Ratio (PSNR) of the encrypted video is below 15.956 dB, and the average PSNR of the decrypted video carrying secret data can exceed 30 dB. The proposed scheme effectively improves the embedding capacity and is suitable for more types of video sequences.
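
Histogram shifting itself is a classic RDH primitive; the sketch below shows the one-dimensional version on a toy integer sequence (standing in for MVD components), not the paper's two-dimensional (0,0)-symmetric variant.

```python
def hs_embed(data, bits, peak, zero):
    """Classic histogram-shifting embedding: values strictly between `peak`
    and `zero` shift by 1 to free the bin next to the peak; each occurrence
    of `peak` then carries one payload bit (peak -> bit 0, peak+1 -> bit 1)."""
    assert peak < zero, "demo assumes the peak bin lies left of the empty bin"
    out, it = [], iter(bits)
    for v in data:
        if peak < v < zero:
            out.append(v + 1)                            # shift to make room
        elif v == peak:
            b = next(it, None)
            out.append(v if b in (None, 0) else v + 1)   # embed one bit
        else:
            out.append(v)
    return out

def hs_extract(marked, peak, zero):
    """Inverse of hs_embed: recover the bits and restore the original data."""
    bits, data = [], []
    for v in marked:
        if v == peak:
            bits.append(0); data.append(v)
        elif v == peak + 1:
            bits.append(1); data.append(peak)
        elif peak + 1 < v <= zero:
            data.append(v - 1)                           # undo the shift
        else:
            data.append(v)
    return bits, data

mvd = [0, 1, -2, 0, 3, 0, 2, 0, 1]      # toy MVD values; peak bin = 0
marked = hs_embed(mvd, [1, 0, 1, 1], peak=0, zero=4)
bits, restored = hs_extract(marked, peak=0, zero=4)
print(bits, restored == mvd)            # [1, 0, 1, 1] True
```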

Fake news detection method based on blockchain technology
Shengjia GONG, Linlin ZHANG, Kai ZHAO, Juntao LIU, Han YANG
Journal of Computer Applications    2022, 42 (11): 3458-3464.   DOI: 10.11772/j.issn.1001-9081.2021111885

Fake news not only leads to misconceptions and damages people's right to know the truth, but also reduces the credibility of news websites. In view of the occurrence of fake news on news websites, a fake news detection method based on blockchain technology was proposed. Firstly, a smart contract was invoked to randomly assign reviewers to each news item to determine its authenticity. Then, the credibility of the review results was improved by adjusting the number of reviewers and ensuring a sufficient number of effective reviewers. At the same time, an incentive mechanism was designed in which rewards are distributed according to the reviewers' behaviors, and the behaviors and rewards were analyzed with game theory, showing that honest reviewing is the benefit-maximizing strategy for reviewers. An auditing mechanism was also designed to detect malicious reviewers and improve system security. Finally, a simple blockchain fake news detection system was implemented with Ethereum smart contracts and simulated for fake news detection; the results show that the accuracy of news authenticity detection reaches 95%, indicating that the proposed method can effectively curb the release of fake news.

Survey of event extraction
Chunming MA, Xiuhong LI, Zhe LI, Huiru WANG, Dan YANG
Journal of Computer Applications    2022, 42 (10): 2975-2989.   DOI: 10.11772/j.issn.1001-9081.2021081542

Event extraction means extracting the events that users are interested in from unstructured information and presenting them to users in a structured form. It has a wide range of applications in information collection, information retrieval, document synthesis, and question answering. From an overall perspective, event extraction algorithms can be divided into four categories: pattern matching algorithms, trigger-word-based methods, ontology-based algorithms, and cutting-edge joint-model methods. In the research process, different evaluation methods and datasets can be used according to the needs, and different event representation methods are also relevant to event extraction research. Distinguished by task type, meta-event extraction and subject event extraction are the two basic tasks of event extraction. Meta-event extraction has three kinds of methods, based respectively on pattern matching, machine learning, and neural networks, while subject events can be extracted in two ways, based on the event framework or on an ontology. Event extraction research has achieved excellent results in single languages such as Chinese and English, but cross-language event extraction still faces many problems. Finally, related work on event extraction was summarized and future research directions were discussed in order to provide guidance for subsequent research.

Low-rank representation subspace clustering method based on Hessian regularization and non-negative constraint
Lili FAN, Guifu LU, Ganyi TANG, Dan YANG
Journal of Computer Applications    2022, 42 (1): 115-122.   DOI: 10.11772/j.issn.1001-9081.2021071181

Focusing on the issue that the Low-Rank Representation (LRR) subspace clustering algorithm does not consider the local structure of the data and may lose local similarity information during learning, a Low-Rank Representation subspace clustering algorithm based on Hessian regularization and Non-negative constraint (LRR-HN) was proposed to explore both the global and local structure of the data. Firstly, the good extrapolation capability of Hessian regularization was used to maintain the local manifold structure of the data, making the local topological structure more expressive. Secondly, considering that the learned coefficient matrix often has both positive and negative values, and that negative values often have no practical meaning, non-negative constraints were introduced to ensure the validity of the model solution and make it more meaningful for describing the local structure of the data. Finally, a low-rank representation of the global structure of the data was sought by minimizing the nuclear norm, so as to cluster high-dimensional data better. In addition, an effective algorithm for solving LRR-HN was designed using the linearized alternating direction method with adaptive penalty, and the proposed algorithm was evaluated by ACcuracy (AC) and Normalized Mutual Information (NMI) on real datasets. In experiments on the ORL dataset with 20 clusters, LRR-HN improves AC and NMI by 11% and 9.74% respectively compared with the LRR algorithm, and by 5% and 1.05% respectively compared with the Adaptive Low-Rank Representation (ALRR) algorithm. Experimental results show that LRR-HN clearly improves AC and NMI over some existing algorithms and has excellent clustering performance.
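
The nuclear-norm minimization at the heart of LRR-type models reduces to repeated Singular Value Thresholding (SVT) inside the alternating-direction solver; a minimal SVT step (without the Hessian-regularization and non-negativity terms of the full model) looks like this:

```python
import numpy as np

def svt(M, tau):
    """Singular Value Thresholding: the proximal operator of the nuclear
    norm, argmin_X tau*||X||_* + 0.5*||X - M||_F^2, computed by
    soft-thresholding the singular values of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
low_rank = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 50))   # rank 5
noisy = low_rank + 0.3 * rng.normal(size=(50, 50))
Z = svt(noisy, tau=5.0)
print(np.linalg.matrix_rank(Z, tol=1e-6))   # few singular values survive
```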

Adaptive deep graph convolution using initial residual and decoupling operations
Jijie ZHANG, Yan YANG, Yong LIU
Journal of Computer Applications    2022, 42 (1): 9-15.   DOI: 10.11772/j.issn.1001-9081.2021071289

The traditional Graph Convolutional Network (GCN) and many of its variants achieve their best effect at shallow depth and do not make full use of the higher-order neighbor information of nodes in the graph. Subsequent deep graph convolution models can solve this problem, but inevitably introduce over-smoothing, which makes it impossible for the models to effectively distinguish different types of nodes in the graph. To address this problem, an adaptive deep graph convolution model using initial residual and decoupling operations, named ID-AGCN (model using Initial residual and Decoupled Adaptive Graph Convolutional Network), was proposed. Firstly, the node representation transformation and feature propagation were decoupled. Then, the initial residual was added to the feature propagation process. Finally, the node representations obtained from different propagation layers were combined adaptively, selecting appropriate local and global information for each node to obtain node representations containing rich information, and a small number of labeled nodes were used for supervised training to generate the final node representations. Experimental results on the three datasets Cora, CiteSeer and PubMed indicate that the classification accuracy of ID-AGCN is improved by about 3.4, 2.3 and 1.9 percentage points respectively compared with GCN. The proposed model is superior in alleviating over-smoothing.
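
The decoupled propagation with initial residual can be written in a few lines; the sketch below follows the standard form H(l+1) = (1 - alpha) * A_norm H(l) + alpha * H(0), with fixed rather than adaptively learned combination weights, so it approximates ID-AGCN's propagation rather than reproducing the full model.

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return d_inv_sqrt @ A_hat @ d_inv_sqrt

def propagate(A, H0, layers=16, alpha=0.1, gamma=None):
    """Decoupled propagation: H0 is the transformed node representation
    (an MLP output in the full model); each step mixes the neighborhood
    average with the initial residual, and per-layer outputs are combined
    with weights gamma (fixed here, adaptive in the paper)."""
    A_norm = normalize_adj(A)
    gamma = gamma if gamma is not None else np.ones(layers) / layers
    H, out = H0, np.zeros_like(H0)
    for l in range(layers):
        H = (1 - alpha) * (A_norm @ H) + alpha * H0   # initial residual
        out += gamma[l] * H                           # layer combination
    return out

rng = np.random.default_rng(0)
A = (rng.random((30, 30)) < 0.1).astype(float)
A = np.maximum(A, A.T)                                # undirected toy graph
H0 = rng.normal(size=(30, 8))
print(propagate(A, H0).shape)          # (30, 8): deep yet not over-smoothed
```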

Transfer learning based on graph convolutional network in bearing service fault diagnosis
Xueying PENG, Yongquan JIANG, Yan YANG
Journal of Computer Applications    2021, 41 (12): 3626-3631.   DOI: 10.11772/j.issn.1001-9081.2021060974

Deep learning methods are widely used in bearing fault diagnosis, but in actual engineering applications, real fault data collected during bearing service are scarce and lack labels, making adequate training difficult. Focusing on this difficulty, a transfer learning model based on the Graph Convolutional Network (GCN) for bearing service fault diagnosis was proposed. In the model, fault knowledge was learned from artificially simulated damage fault data, which are plentiful, and transferred to real service faults, so as to improve the diagnostic accuracy on service faults. Specifically, the original vibration signals of the artificially simulated damage data and the service fault data were converted by wavelet transform into time-frequency maps carrying both time and frequency information, and these maps were input into graph convolutional layers for learning, so as to effectively extract fault feature representations in the source and target domains. Then the Wasserstein distance between the data distributions of the source and target domains was calculated to measure their difference, and a fault diagnosis model capable of diagnosing bearing service faults was constructed by minimizing this distribution difference. Various tasks were designed for experiments on different bearing fault datasets and operating conditions. Experimental results show that the proposed model can diagnose bearing service faults, transfer from one working condition to another, and perform fault diagnosis across different component types and working conditions.
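
The domain-gap term can be illustrated with SciPy's one-dimensional Wasserstein distance; in the full model the distance is computed over learned GCN representations and minimized jointly with the source-domain loss, so the snippet below only shows the measurement, on synthetic one-dimensional features.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
# stand-ins for one feature dimension of the GCN outputs on the two domains
source_feat = rng.normal(loc=0.0, scale=1.0, size=2000)  # simulated-damage data
target_feat = rng.normal(loc=0.7, scale=1.2, size=300)   # scarce service data

# the gap for this feature; averaging over dimensions gives the alignment
# term that training would minimize alongside the source classification loss
print(wasserstein_distance(source_feat, target_feat))
```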

Extractive and abstractive summarization model based on pointer-generator network
Wei CHEN, Yan YANG
Journal of Computer Applications    2021, 41 (12): 3527-3533.   DOI: 10.11772/j.issn.1001-9081.2021060899

As a hot issue in natural language processing, summary generation has important research significance. The abstractive method based on the Seq2Seq (Sequence-to-Sequence) model has achieved good results; meanwhile, the extractive method has the potential to mine effective features and extract important sentences from articles, so improving the abstractive method with extractive techniques is a promising research direction. In view of this, a model fusing abstractive and extractive methods was proposed. Firstly, the TextRank algorithm, incorporating topic similarity, was used to extract salient sentences from the article. Then, an abstractive framework based on the Seq2Seq model, integrating the semantics of the extracted information, was designed to perform the summarization task; at the same time, a pointer-generator network was introduced to solve the Out-Of-Vocabulary (OOV) problem. Based on the above steps, the final summary was obtained and verified on the CNN/Daily Mail dataset. The results show that on all three metrics, ROUGE-1, ROUGE-2 and ROUGE-L, the proposed model outperforms the traditional TextRank algorithm; the effectiveness of fusing extractive and abstractive methods for summarization is also verified.
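
The extractive stage can be sketched as a topic-biased TextRank over sentences; the overlap-based similarity and the bias toward a topic string below are common choices, used here as stand-ins for the paper's exact topic-similarity term.

```python
import numpy as np

def textrank(sentences, topic=None, d=0.85, iters=50, top_k=2):
    """TextRank over sentences: bag-of-words overlap as edge weight,
    optionally biased toward a topic (a rough topic-similarity term)."""
    bags = [set(s.lower().split()) for s in sentences]
    n = len(sentences)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                norm = np.log(len(bags[i]) + 1) + np.log(len(bags[j]) + 1)
                W[i, j] = len(bags[i] & bags[j]) / (norm + 1e-9)
    col = W.sum(axis=0); col[col == 0] = 1.0
    W = W / col                                   # column-normalized weights
    bias = np.ones(n) / n
    if topic is not None:                         # topic-similarity bias
        t = set(topic.lower().split())
        sim = np.array([len(t & b) + 1e-9 for b in bags])
        bias = sim / sim.sum()
    r = np.ones(n) / n
    for _ in range(iters):
        r = (1 - d) * bias + d * (W @ r)          # biased PageRank iteration
    return [sentences[i] for i in np.argsort(-r)[:top_k]]

doc = ["The storm closed the airport on Monday.",
       "Hundreds of flights were cancelled because of the storm.",
       "A local bakery opened a new branch.",
       "Airport officials said flights resume Tuesday."]
print(textrank(doc, topic="storm airport flights"))
```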

Many-objective particle swarm optimization algorithm based on hyper-spherical fuzzy dominance
TAN Yang, TANG Dequan, CAO Shoufu
Journal of Computer Applications    2019, 39 (11): 3233-3241.   DOI: 10.11772/j.issn.1001-9081.2019040710
With the increase of the dimensionality of the problem to be optimized, a Many-objective Optimization Problem (MAOP) forms a huge objective space, resulting in a sharp increase in the proportion of non-dominated solutions, which weakens the selection pressure of evolutionary algorithms and reduces their efficiency in solving MAOPs. To solve this problem, a Particle Swarm Optimization (PSO) algorithm using a hyper-spherical fuzzy dominance relation to reduce the number of non-dominated solutions was proposed. The fuzzy dominance strategy was used to maintain the selection pressure of the population on the MAOP, while the distribution of individuals in the objective space was maintained through the selection of global extrema and the maintenance of external archives. Simulation results on the standard test suites DTLZ and WFG show that the proposed algorithm achieves better convergence and distribution when solving MAOPs.
Personal relation extraction based on text headline
YAN Yang, ZHAO Jiapeng, LI Quangang, ZHANG Yang, LIU Tingwen, SHI Jinqiao
Journal of Computer Applications    2016, 36 (3): 726-730.   DOI: 10.11772/j.issn.1001-9081.2016.03.726
In order to overcome the interference of non-person entities, the difficulty of selecting feature words, and the influence of multiple persons on target personal relation extraction, this paper proposed person judgment based on a decision tree, relation feature word generation based on minimum set cover, and a statistical approach based on three-layer sentence pattern rules. In the first step, 18 features were extracted from the attribute files of the China Conference on Machine Learning (CCML) competition 2015, and a C4.5 decision tree was used as the classifier, achieving a recall of 98.2% and a precision of 92.6%; the results of this step were used as the next step's input. Next, an algorithm based on minimum set cover was used: the feature word set covers all personal relation types while its size is kept at a proper level, and it is used to identify the relation type in a text headline. In the last step, a statistical method over three-layer sentence pattern rules was used to filter out low-proportion rules, and the sentence pattern rules were specialized according to the proportions of positive and negative instances to judge whether an extracted personal relation is correct. The experimental results show the approach achieves a recall of 82.9%, a precision of 74.4% and an F1-measure of 78.4%, so the proposed method can be applied to personal relation extraction from text headlines, which helps to construct personal relation knowledge graphs.
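
The minimum-set-cover step admits the usual greedy approximation; a sketch with toy relation types and candidate feature words (all data invented for illustration):

```python
def greedy_set_cover(universe, candidates):
    """Greedy minimum set cover: repeatedly take the feature word that
    covers the most still-uncovered relation types, so the final word
    set covers all relations while staying small."""
    uncovered, chosen = set(universe), []
    while uncovered:
        word = max(candidates, key=lambda w: len(candidates[w] & uncovered))
        if not candidates[word] & uncovered:
            raise ValueError("remaining relation types cannot be covered")
        chosen.append(word)
        uncovered -= candidates[word]
    return chosen

relations = {"spouse", "colleague", "rival", "parent"}
words = {                      # relation types each feature word indicates
    "wife":     {"spouse"},
    "married":  {"spouse", "parent"},
    "teammate": {"colleague"},
    "versus":   {"rival"},
    "partner":  {"spouse", "colleague"},
}
print(greedy_set_cover(relations, words))   # e.g. ['married', 'teammate', 'versus']
```
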
Chromosomal translocation-based dynamic evolutionary algorithm
TAN Yang, NING Ke, CHEN Lin
Journal of Computer Applications    2015, 35 (9): 2584-2589.   DOI: 10.11772/j.issn.1001-9081.2015.09.2584
When traditional binary-coded evolutionary algorithms are applied to function optimization, mutual interference between different dimensions prevents the effective recombination of some low-order schemata. A new evolutionary algorithm, the Dynamic Chromosomal Translocation-based Evolutionary Algorithm (CTDEA), was proposed based on cytological findings. The algorithm simulates the structured organization of organic chromosomes inside cells by constructing gene matrices, and realizes modular translocations of homologous chromosomes on the basis of the gene matrix in order to maintain population diversity. Moreover, an individual-fitness-based population-dividing method was adopted to safeguard elite populations, ensure competition among individuals, and improve the optimization speed of the algorithm. Experimental results show that, compared with existing Genetic Algorithms (GA) and estimation of distribution algorithms, this evolutionary algorithm greatly improves population diversity, keeping it around 0.25, and shows obvious advantages in optimization accuracy, stability and speed.
Performance evaluation method based on objective weight determination for data center network
NAN Yang, CHEN Lin
Journal of Computer Applications    2015, 35 (11): 3055-3058.   DOI: 10.11772/j.issn.1001-9081.2015.11.3055
For large-scale data center networks, how to monitor the network effectively, discover performance bottlenecks and potential points of failure, and support network performance optimization has become a new research subject. However, many factors affect network performance and their influences differ, so giving an accurate performance evaluation has been a difficult problem. To solve these problems, a network performance evaluation index system was proposed, and on this basis a method for evaluating data center network performance based on objective weight determination (PE-OWD) was put forward. The weights of the indexes were calculated dynamically using an objective weight determination method, and a complete network performance evaluation model was established using a data normalization method based on historical parameters. The performance indexes of the network equipment were evaluated in the real network environment of Tianhe-2, and the validity of the network performance evaluation method was verified.
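
The abstract does not give PE-OWD's exact weighting formula; the entropy weight method below is one widely used objective-weighting scheme and is shown purely as an assumed stand-in.

```python
import numpy as np

def entropy_weights(X):
    """Entropy weight method: indicators whose values vary more across
    samples carry more information and receive larger weights."""
    P = X / X.sum(axis=0, keepdims=True)        # column-wise proportions
    k = 1.0 / np.log(X.shape[0])
    with np.errstate(divide="ignore", invalid="ignore"):
        e = -k * np.sum(np.where(P > 0, P * np.log(P), 0.0), axis=0)
    d = 1.0 - e                                 # degree of divergence
    return d / d.sum()

# rows = network devices, columns = indicators already normalized to
# positive "bigger is better" values (e.g. throughput, 1/latency)
X = np.array([[0.9, 0.4, 0.95],
              [0.8, 0.9, 0.90],
              [0.3, 0.5, 0.92]])
w = entropy_weights(X)
print(w, X @ w)                                 # weights and overall scores
```
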
Data acquisition and transmission system for γ-ray industrial computed tomography
GAO Fuqiang, CHEN Chunjiang, LAN Yang, AN Kang
Journal of Computer Applications    2015, 35 (1): 276-278.   DOI: 10.11772/j.issn.1001-9081.2015.01.0276

In order to meet the high-speed, multi-channel data acquisition and transmission requirements of γ-ray industrial Computed Tomography (CT), a system based on the User Datagram Protocol (UDP) and controlled by a Field-Programmable Gate Array (FPGA) was designed. The system added FPGA counting units so that more channels could be used for data collection. The main control took the FPGA as its core, with the UDP protocol implemented in Verilog; data were then transmitted through an Ethernet interface chip to the host computer for image reconstruction. The host computer interface and its communication with the underlying transmission circuit were implemented in VC++ 6.0. The experimental results indicate that, in 100 Mb/s full-duplex mode, the network utilization rate can reach 93%, the transmission speed is 93 Mb/s (11.625 MB/s), and the host computer can receive data correctly over a long distance, satisfying the speed and distance requirements of γ-ray industrial CT.

Real-time trajectory simplification algorithm of moving objects based on minimum bounding sector
WANG Xinran, YANG Zhiying
Journal of Computer Applications    2014, 34 (8): 2409-2414.   DOI: 10.11772/j.issn.1001-9081.2014.08.2409

To improve the efficiency of trajectory data applications and reduce the communication cost and computational overhead of mobile terminals, the raw trajectory data of moving objects collected by Global Positioning System (GPS) equipment must be simplified. A method based on the Minimum Bounding Sector (MBS) for real-time trajectory simplification of moving objects was proposed. Unlike algorithms that approximate the original trajectory with a polyline, it adopts a sector to predict the moving range, which allows the original trajectory to be estimated and simplified. In order to control the simplification error efficiently, an identical-polar-radius error metric was proposed based on the characteristics of sector angle and distance. In addition, the effect of GPS positioning error on the simplification algorithm was discussed. The experimental results show that the proposed algorithm is efficient and stable: the simplified trajectory's error relative to the original is small (no more than 20% of the error threshold), and the algorithm tolerates GPS positioning error well.

Gaussian weighted multiple classifiers for object tracking
LAN Yuandong, DENG Huifang, CAI Zhaoquan, YANG Xiong
Journal of Computer Applications    2014, 34 (8): 2394-2398.   DOI: 10.11772/j.issn.1001-9081.2014.08.2394

When the appearance of an object changes rapidly, most weak learners cannot capture the new feature distributions, which leads to tracking failure. To deal with this issue, a Gaussian-weighted online multiple-classifier boosting algorithm for object tracking was proposed. For each domain problem, the algorithm defined a weak classifier that combined a simple visual feature with a threshold. A Gaussian weighting function was introduced to weight each weak classifier's contribution on a particular sample, so that tracking performance was improved through the joint learning of multiple classifiers. In the process of object tracking, the online multiple classifiers can not only determine the location and estimate the pose of the object simultaneously, but also successfully learn multi-modal appearance models and track an object under rapid appearance changes. The experimental results show that, after a short initial training phase, the average tracking error rate of the proposed algorithm is 12.8%, which proves that the tracking performance is enhanced significantly.

Lightweight privacy-preserving data aggregation algorithm
CHEN Yanli, FU Chunjuan, XU Jian, YANG Geng
Journal of Computer Applications    2014, 34 (8): 2336-2341.   DOI: 10.11772/j.issn.1001-9081.2014.08.2336

Private data are vulnerable to attacks on data confidentiality, integrity and freshness. To resolve this problem, a secure data aggregation algorithm based on a homomorphic hash function, called HPDA (High-Efficiency Privacy Preserving Data Aggregation), was proposed. Firstly, it used a homomorphic encryption scheme to preserve data privacy. Secondly, it adopted a homomorphic hash function to verify the integrity and freshness of the aggregated data. Finally, it reduced the communication overhead of the system through an improved ID transmission mechanism. Theoretical analyses and simulation results show that HPDA can effectively preserve data confidentiality, verify data integrity, ensure data freshness, and incur low communication overhead.
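
A toy sketch of the additive homomorphic-hash check that makes in-network aggregation verifiable: with H(x) = g^x mod p, the product of the leaf hashes equals the hash of the sum. The parameters are illustrative toy values; HPDA's actual construction additionally covers encryption and freshness.

```python
# toy parameters: a real deployment needs a large safe prime and nonces
P = 2_147_483_647          # prime modulus
G = 16807                  # generator

def hhash(x):
    """Additively homomorphic hash: H(a) * H(b) mod P == H(a + b)."""
    return pow(G, x, P)

readings = [23, 41, 17, 38]                 # sensor nodes' private readings
leaf_hashes = [hhash(r) for r in readings]  # each node also reports H(reading)

aggregate = sum(readings)                   # computed by the aggregator node
agg_hash = 1
for h in leaf_hashes:
    agg_hash = (agg_hash * h) % P           # combine hashes homomorphically

# the sink verifies the aggregate without seeing individual readings
print(agg_hash == hhash(aggregate))         # True unless data was tampered with
```
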

Uncertainty data processing by fuzzy support vector machine with fuzzy similarity measure and fuzzy mapping
WANG Yufan, LIANG Gongqian, YANG Jing
Journal of Computer Applications    2014, 34 (7): 2066-2070.   DOI: 10.11772/j.issn.1001-9081.2014.07.2066

In order to improve the ability of the traditional Fuzzy Support Vector Machine (FSVM) to process uncertain data, an FSVM with a fuzzy similarity measure and a high-dimensional-space fuzzy mapping was proposed. Firstly, a fuzzy similarity measure function was established using the Gregson similarity measure, which effectively characterizes uncertainty information. Then, using mapping theory and the Mercer theorem, fuzzy similarity kernel learning was formulated and used in the FSVM algorithm. Finally, the algorithm was applied to modeling the material removal rate in rotary ultrasonic machining with uncertain data. Compared with traditional FSVM methods, the proposed approach processes uncertain data in fewer operation steps and achieves higher accuracy with lower computational complexity.
